Factorial graphical lasso for dynamic networks
Dynamic network models describe a growing number of important scientific
processes, from cell biology and epidemiology to sociology and finance. There
are many aspects of dynamical networks that require statistical considerations.
In this paper we focus on determining network structure. Estimating dynamic
networks is a difficult task since the number of components involved in the
system is very large. As a result, the number of parameters to be estimated is
bigger than the number of observations. However, a characteristic of many
networks is that they are sparse. For example, the molecular structure of genes
makes interactions with other components a highly structured and therefore
sparse process.
Penalized Gaussian graphical models have been used to estimate sparse
networks. However, the literature has focussed on static networks, which lack
specific temporal constraints. We propose a structured Gaussian dynamical
graphical model, where structures can consist of specific time dynamics, known
presence or absence of links and block equality constraints on the parameters.
Thus, the number of parameters to be estimated is reduced and accuracy of the
estimates, including the identification of the network, can be tuned up. Here,
we show that the constrained optimization problem can be solved by taking
advantage of an efficient solver, logdetPPA, developed in convex optimization.
Moreover, model selection methods for checking the sensitivity of the inferred
networks are described. Finally, synthetic and real data illustrate the
proposed methodologies.
Comment: 30 pp, 5 figures
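As a reading aid, the constrained estimation problem described in this abstract can be written in the generic form of an l1-penalized Gaussian likelihood with structural constraints. This is a sketch under assumptions, not the paper's exact formulation: the symbols S (sample covariance matrix), \rho (tuning parameter) and \mathcal{C} (the set of precision matrices satisfying the known zeros and block equality constraints) are notation introduced here.

```latex
\hat{\Theta}
  = \arg\max_{\Theta \succ 0,\; \Theta \in \mathcal{C}}
    \Bigl\{ \log\det\Theta - \operatorname{tr}(S\Theta) - \rho\,\lVert\Theta\rVert_{1} \Bigr\}
```

Larger values of \rho shrink more entries of \Theta to zero and so give sparser networks, while the constraints in \mathcal{C} reduce the number of free parameters before any penalization is applied.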
Inferring networks from high-dimensional data with mixed variables
We present two methodologies to deal with high-dimensional data with mixed variables: the strongly decomposable graphical model and the regression-type graphical model. The first model is used to infer conditional independence graphs. The latter model is applied to compute the relative importance or contribution of each predictor to the response variables. Recently, penalized likelihood approaches have also been proposed to estimate graph structures. In a simulation study, we compare the performance of the strongly decomposable graphical model and the graphical lasso in terms of graph recovery. Five different graph structures are used to simulate the data: the banded graph, the cluster graph, the random graph, the hub graph and the scale-free graph. We assume the graphs are sparse. Our finding in the simulation study is that the strongly decomposable graphical model generally shows comparable or better performance in both the low- and high-dimensional cases. Finally, we show an application on mixed data.
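The five graph structures named in this abstract can be illustrated with a minimal Python sketch that builds sparse adjacency matrices. All function names, default parameters and construction rules below are assumptions made for illustration; the abstract does not say how the authors generated their graphs.

```python
import random


def empty_graph(p):
    """Return a p x p adjacency matrix with no edges."""
    return [[0] * p for _ in range(p)]


def add_edge(adj, i, j):
    """Add an undirected edge between distinct nodes i and j."""
    if i != j:
        adj[i][j] = adj[j][i] = 1


def groups(p, k):
    """Split nodes 0..p-1 into k contiguous groups (last group takes the remainder)."""
    size = p // k
    return [range(g * size, p if g == k - 1 else (g + 1) * size) for g in range(k)]


def banded_graph(p, bandwidth=1):
    """Each node is linked to its next `bandwidth` neighbours."""
    adj = empty_graph(p)
    for i in range(p):
        for j in range(i + 1, min(p, i + bandwidth + 1)):
            add_edge(adj, i, j)
    return adj


def cluster_graph(p, n_clusters=2, prob=0.5, rng=None):
    """Random edges inside each cluster, none between clusters."""
    rng = rng or random.Random(0)
    adj = empty_graph(p)
    for block in groups(p, n_clusters):
        for i in block:
            for j in block:
                if i < j and rng.random() < prob:
                    add_edge(adj, i, j)
    return adj


def random_graph(p, prob=0.1, rng=None):
    """Every pair of nodes is linked independently with probability `prob`."""
    rng = rng or random.Random(0)
    adj = empty_graph(p)
    for i in range(p):
        for j in range(i + 1, p):
            if rng.random() < prob:
                add_edge(adj, i, j)
    return adj


def hub_graph(p, n_hubs=2):
    """The first node of each group is a hub linked to the rest of its group."""
    adj = empty_graph(p)
    for block in groups(p, n_hubs):
        for j in block[1:]:
            add_edge(adj, block[0], j)
    return adj


def scale_free_graph(p, rng=None):
    """Preferential attachment (Barabasi-Albert style), one edge per new node."""
    rng = rng or random.Random(0)
    adj = empty_graph(p)
    add_edge(adj, 0, 1)
    endpoints = [0, 1]  # each node appears once per incident edge
    for new in range(2, p):
        old = rng.choice(endpoints)
        add_edge(adj, new, old)
        endpoints += [new, old]
    return adj


def n_edges(adj):
    """Number of undirected edges in a symmetric adjacency matrix."""
    return sum(map(sum, adj)) // 2
```

In a simulation of this kind, each adjacency matrix would then be turned into a positive definite precision matrix from which data are sampled. That step is omitted here.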
Inferring gene networks from microarray with graphical models
Microarray technology makes it possible to collect a large amount of genetic data, such as gene expression data. The activity of the genes is coordinated by a complex network that regulates their expression, controlling common functions such as the formation of a transcriptional complex or the availability of a signalling pathway. Understanding this organization is crucial to explain normal cell physiology as well as to analyse complex pathological phenotypes. Graphical models are a class of statistical models that can be used to infer gene regulatory networks. In this paper, we examine a class of graphical models: the strongly decomposable graphical models for mixed variables. Among other properties, explicit expressions for the maximum likelihood estimators are available for decomposable graphical models. This property makes decomposable models suitable for high-dimensional data. We apply decomposable graphical models to a real dataset.
Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks
Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order (some entries of the precision matrix are a priori zeros) or equal dependency strengths across time lags (some entries of the precision matrix are assumed to be equal). The precision matrix is then estimated by l1-penalized maximum likelihood, imposing a further constraint on the absolute value of its entries, which results in sparse networks. Selecting the optimal sparsity level is a major challenge for this type of approach. In this paper, we evaluate the performance of a number of model selection criteria for fGGMs by means of two simulated regulatory networks from realistic biological processes. The analysis reveals a good performance of fGGMs in comparison with other methods for inferring dynamic networks, and of the KLCV criterion in particular for model selection. Finally, we present an application to high-resolution time-course microarray data from the Neisseria meningitidis bacterium, a causative agent of life-threatening infections such as meningitis. The methodology described in this paper is implemented in the R package sglasso, freely available at CRAN, http://CRAN.R-project.org/package=sglasso
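The two kinds of restriction mentioned in this abstract (a priori zeros from a low-order Markov assumption, and equal entries across time for the same lag) can be made concrete with a small Python sketch that labels the entries of a precision matrix for p genes observed at T time points. The labelling scheme and the function names are assumptions made for illustration. They are not the parameterization used in the paper or in the sglasso package.

```python
def structure_labels(p, T, order=1):
    """Label each entry of a (p*T) x (p*T) precision matrix.

    Row/column index s * p + i stands for gene i at time point s.
    None marks a structural zero (time lag above `order`);
    entries sharing a label are constrained to be equal.
    """
    n = p * T
    labels = [[None] * n for _ in range(n)]
    for s in range(T):
        for t in range(T):
            lag = abs(s - t)
            if lag > order:
                continue  # Markov assumption: no direct dependence beyond `order`
            for i in range(p):
                for j in range(p):
                    if lag == 0:
                        key = (0, min(i, j), max(i, j))  # symmetric within a time point
                    elif s < t:
                        key = (lag, i, j)  # gene i at the earlier time, gene j later
                    else:
                        key = (lag, j, i)  # mirrored entry keeps the matrix symmetric
                    labels[s * p + i][t * p + j] = key
    return labels


def count_free_parameters(labels):
    """Number of distinct non-zero parameters implied by the labels."""
    return len({key for row in labels for key in row if key is not None})
```

For example, with p = 3 genes, T = 4 time points and a first-order Markov structure, this labelling leaves 15 free parameters, whereas an unconstrained symmetric 12 x 12 precision matrix has 78.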
Approximate Bayesian Computation for Forecasting in Hydrological models
Approximate Bayesian Computation (ABC) is a statistical tool for handling
parameter inference in a range of challenging statistical problems, mostly
characterized by an intractable likelihood function. In this paper, we focus on the
application of ABC to hydrological models, not as a tool for parametric inference,
but as a mechanism for generating probabilistic forecasts. This mechanism is referred
to as Approximate Bayesian Forecasting (ABF). The abcd water balance model
is applied to a case study on the Aipe river basin in Colombia to demonstrate the applicability
of ABF. The predictive performance of ABF is compared with that of the
MCMC algorithm. The results show that the ABF method has similar performance
to the MCMC algorithm in terms of forecasting. Although the latter is a very flexible
tool and usually gives better parameter estimates, it needs a tractable likelihood.
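The idea of using ABC to generate forecasts rather than parameter estimates can be sketched with a simple rejection sampler. This is a toy illustration under stated assumptions, not the paper's procedure: a Gaussian model with unknown mean stands in for the abcd water balance model, and the uniform prior, the sample-mean summary statistic, the tolerance eps and all function names are hypothetical choices.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumption): n Gaussian observations with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_forecast(observed, n_draws=5000, eps=0.1, seed=0):
    """ABC rejection sampling that also returns one-step-ahead forecasts.

    A prior draw is accepted when the summary statistic (sample mean) of
    its simulated data lies within eps of the observed summary. Each
    accepted draw then simulates one future value, so the forecasts form
    an approximate predictive sample without evaluating a likelihood.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    accepted, forecasts = [], []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the (assumed) prior
        simulated = simulate(theta, len(observed), rng)
        if abs(statistics.fmean(simulated) - obs_summary) < eps:
            accepted.append(theta)
            forecasts.append(rng.gauss(theta, 1.0))  # simulate the next value
    return accepted, forecasts
```

The quantiles of the returned forecasts give a probabilistic forecast interval. Shrinking eps tightens the approximation at the cost of fewer accepted draws.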
Tourism statistics
How a nation collects and shares data on tourism in the country is influenced by that
country’s historical, cultural, and political background, as well as its geographical
characteristics. This means there are a wide variety of ways tourism data are collected,
evaluated, and disseminated. Although standard statistical mechanisms for tracking tourism
have been improved, to be of greatest use, statistical sources should be further enhanced
and be comparable across locations and time. For this reason, several international
organizations have the responsibility of harmonizing definitions of, and methodologies for,
collecting data. These organizations are also relevant suppliers of data, working to
provide consistent tourism statistics.
Operational and financial performance of Italian airport companies: A dynamic graphical model
This paper provides evidence on the relationships within a set of financial and operational indicators for Italian airports over 2008–2014. The limited sample size of national and regional airports suggests applying the penalised RCON (V, E) model, which falls within the class of Gaussian graphical models. It provides both estimates of the conditional independence structures of the variables and an easy way to visualise them. Moreover, it is particularly suitable for handling longitudinal data where a small number of units and a large number of variables have been collected. Findings highlight that a qualified concept of size matters in determining good financial performance. Specifically, jointly increasing the number of movements with flights that would attract a high number of passengers may improve both sales profitability and the revenues generated by the company's assets. Results suggest that the effect of low-cost carriers has been heterogeneous throughout the sample, which may suggest new opportunities to expand the business in order to intercept the consumer surplus of this category of travellers.